CUR Decompositions, Similarity Matrices, and Subspace Clustering
Abstract
A general framework for solving the subspace clustering problem using the CUR decomposition is presented. The CUR decomposition provides a natural way to construct similarity matrices for data that come from a union of unknown subspaces $\mathscr{U} = \bigcup_{i=1}^{M} S_i$. The similarity matrices thus constructed give the exact clustering in the noise-free case. A simple adaptation of the technique also allows clustering of noisy data. Two known methods for subspace clustering can be derived from the CUR technique. Experiments on synthetic and real data are presented to test the method.
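As a rough illustration of the idea in the abstract, the following NumPy sketch builds noise-free data from two random subspaces, forms a CUR decomposition W = C U† R from selected columns and rows, and checks that |YᵀY| with Y = U† R has numerically zero entries between points from different subspaces. The helper names (`make_union_of_subspaces`, `cur_factors`), the index choices, and the use of |YᵀY| as the similarity matrix are assumptions made for this sketch; they are not taken from the paper and may differ from its actual construction.

```python
import numpy as np


def make_union_of_subspaces(ambient_dim, dims, points_per_subspace, rng):
    """Hypothetical helper: draw points from random subspaces of the given dimensions."""
    blocks, labels = [], []
    for k, d in enumerate(dims):
        basis = rng.standard_normal((ambient_dim, d))           # spans subspace S_k
        coeffs = rng.standard_normal((d, points_per_subspace))  # random coordinates
        blocks.append(basis @ coeffs)
        labels += [k] * points_per_subspace
    return np.hstack(blocks), np.array(labels)


def cur_factors(W, col_idx, row_idx):
    """Hypothetical helper: C = selected columns, R = selected rows, U = their intersection."""
    C = W[:, col_idx]
    R = W[row_idx, :]
    U = W[np.ix_(row_idx, col_idx)]
    return C, U, R


rng = np.random.default_rng(0)
W, labels = make_union_of_subspaces(10, [2, 3], 20, rng)  # 10 x 40 data, rank 2 + 3 = 5

# Take a few columns from each cluster and a subset of rows (assumed choice).
col_idx = np.r_[0:6, 20:26]
row_idx = np.arange(8)
C, U, R = cur_factors(W, col_idx, row_idx)

# The decomposition is exact only when U captures the full rank of W.
assert np.linalg.matrix_rank(U) == np.linalg.matrix_rank(W)

# Explicit cutoff so that numerically zero singular values of U are discarded.
U_pinv = np.linalg.pinv(U, rcond=1e-10)
Y = U_pinv @ R

recon_error = np.linalg.norm(C @ Y - W) / np.linalg.norm(W)
S = np.abs(Y.T @ Y)  # candidate similarity matrix (assumption of this sketch)

cross = labels[:, None] != labels[None, :]
print("relative reconstruction error:", recon_error)
print("largest cross-subspace similarity:", S[cross].max())
```

In this noise-free setting the cross-subspace entries vanish up to rounding, so a standard graph-based method such as spectral clustering applied to S would be one way to recover the clusters. With noisy data the rank check and the pseudoinverse cutoff would both need a tolerance chosen for the noise level.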
Similar Resources
Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods
Much recent work in theoretical computer science, linear algebra, and machine learning has considered matrix decompositions of the following form: given an m×n matrix A, decompose it as a product of three matrices, C, U, and R, where C consists of a small number of columns of A, R consists of a small number of rows of A, and U is a small carefully constructed matrix that guarantees that th...
Deterministic CUR for Improved Large-Scale Data Analysis: An Empirical Study
Low-rank approximations that are computed from selected rows and columns of a given data matrix have attracted considerable attention lately. They have been proposed as an alternative to the SVD because they naturally lead to interpretable decompositions, which have been shown to be successful in applications such as fraud detection, fMRI segmentation, and collaborative filtering. The CUR decompositio...
Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nyström Method
The CUR matrix decomposition and the related Nyström method build low-rank approximations of data matrices by selecting a small number of representative rows and columns of the data. Here, we introduce novel spectral gap error bounds that judiciously exploit the potentially rapid spectrum decay in the input matrix, a most common occurrence in machine learning and data analysis. Our error bounds...
RSVDPACK: An implementation of randomized algorithms for computing the singular value, interpolative, and CUR decompositions of matrices on multi-core and GPU architectures
RSVDPACK is a library of functions for computing low-rank approximations of matrices. The library includes functions for computing standard (partial) factorizations such as the Singular Value Decomposition (SVD), and also so-called “structure preserving” factorizations such as the Interpolative Decomposition (ID) and the CUR decomposition. The ID and CUR factorizations pick subsets of the rows/...
Elastic Net subspace clustering applied to pop/rock music structure analysis
A novel homogeneity-based method for music structure analysis is proposed. The heart of the method is a similarity measure, derived from first principles, that is based on the matrix elastic net (EN) regularization and deals efficiently with highly correlated audio feature vectors. In particular, beat-synchronous mel-frequency cepstral coefficients, chroma features, and auditory temporal modulat...
Journal: CoRR
Volume: abs/1711.04178
Publication date: 2017